class ConvNet(nn.Module):
    def __init__(self):
        # model layers
    def forward(self, x):
        # model structure
The two layers used here are nn.Conv2d and nn.MaxPool2d. The reason we use MaxPool was explained yesterday, so let's look at the code first and then go through the parameters.

import torch
import torch.nn as nn
import torch.nn.functional as F
class ConvNet(nn.Module):
    def __init__(self):
        super(ConvNet, self).__init__()
        # input image shape is 1 * 28 * 28, where 1 is the single color channel
        # and 28 * 28 is the image size
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=3, kernel_size=5)  # output shape = 3 * 24 * 24
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)  # output shape = 3 * 12 * 12
        # input shape is 3 * 12 * 12
        self.conv2 = nn.Conv2d(in_channels=3, out_channels=9, kernel_size=5)  # output shape = 9 * 8 * 8
        # after another max pooling, output shape = 9 * 4 * 4
        self.fc1 = nn.Linear(9 * 4 * 4, 100)
        self.fc2 = nn.Linear(100, 50)
        # the last fully connected layer's output size should match the number of classes
        self.fc3 = nn.Linear(50, 10)

    def forward(self, x):
        # first conv
        x = self.pool(F.relu(self.conv1(x)))
        # second conv
        x = self.pool(F.relu(self.conv2(x)))
        # flatten all dimensions except batch
        x = torch.flatten(x, 1)
        # fully connected layers
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
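As a quick sanity check on the shapes annotated in the comments above, we can push a dummy batch through the same layers one step at a time (a standalone sketch that rebuilds just the conv/pool layers, so it runs on its own):

```python
import torch
import torch.nn as nn

conv1 = nn.Conv2d(in_channels=1, out_channels=3, kernel_size=5)
pool = nn.MaxPool2d(kernel_size=2, stride=2)
conv2 = nn.Conv2d(in_channels=3, out_channels=9, kernel_size=5)

x = torch.randn(1, 1, 28, 28)       # a batch of one grayscale 28 * 28 image
x = pool(torch.relu(conv1(x)))      # conv: 28-5+1=24, pool halves it -> (1, 3, 12, 12)
print(x.shape)
x = pool(torch.relu(conv2(x)))      # conv: 12-5+1=8, pool halves it -> (1, 9, 4, 4)
print(x.shape)
x = torch.flatten(x, 1)             # -> (1, 144), which matches the 9*4*4 in fc1
print(x.shape)
```

If the flattened size here did not match the `9*4*4` passed to `nn.Linear`, the forward pass would raise a shape-mismatch error, so this kind of check is a handy way to debug a new architecture.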
For nn.Conv2d:

in_channels: the number of channels in the input image, which usually corresponds to its color format; a grayscale image has one channel, while an RGB image has three.

out_channels: the number of output channels. There is no particular restriction here, so you can choose it as you like; just remember that when the output is passed on, this value becomes the next layer's in_channels.

kernel_size: the size of our feature filter. There are two ways to express this parameter: kernel_size = num gives a num * num feature filter, and kernel_size = (num1, num2) gives a num1 * num2 feature filter. Note that a feature filter does not have to be square, so you can adjust it as you like.

stride: the step size of the feature filter as it traverses the image; it does not have to move one pixel at a time. This parameter sets how far the convolution moves at each step. The default is 1, which is why we did not set it above.

For nn.MaxPool2d:

kernel_size: as described above, the amount of data the pooling window covers at each step.

stride: as described above, the step size of each pooling move.
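To make kernel_size and stride concrete, here is a small sketch (the shapes and channel counts are chosen just for illustration) showing a non-square filter and a larger stride:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 1, 28, 28)

# non-square feature filter: kernel_size=(3, 5) means height 3, width 5
conv = nn.Conv2d(in_channels=1, out_channels=4, kernel_size=(3, 5))
print(conv(x).shape)      # height 28-3+1=26, width 28-5+1=24 -> (1, 4, 26, 24)

# stride=2: the filter jumps two pixels at a time, roughly halving the output
conv_s2 = nn.Conv2d(in_channels=1, out_channels=4, kernel_size=3, stride=2)
print(conv_s2(x).shape)   # floor((28-3)/2)+1 = 13 -> (1, 4, 13, 13)

# MaxPool2d interprets kernel_size and stride the same way
pool = nn.MaxPool2d(kernel_size=2, stride=2)
print(pool(x).shape)      # (1, 1, 14, 14)
```

Note that the channel count is untouched by stride and pooling; only out_channels of a conv layer changes it.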